ABSTRACT

Current and future weak lensing surveys will rely on photometrically estimated 
redshifts of very large numbers of galaxies. In this paper, we address several differ- 
ent aspects of the demanding photo-z performance that will be required for future 
experiments, such as the proposed ESA Euclid mission. It is first shown that the pro- 
posed all-sky near-infrared photometry from Euclid, in combination with anticipated 
ground-based photometry (e.g. PanStarrs-2 or DES) should yield the required
precision in individual photo-z of $\sigma_z(z) < 0.05(1+z)$ at $I_{AB} < 24.5$. Simple a priori rejection
schemes based on the photometry alone can be tuned to recognise objects with wildly
discrepant photo-z and to reduce the outlier fraction to < 0.25% with only modest loss
of otherwise usable objects. Turning to the more challenging problem of determining
the mean redshift $\langle z \rangle$ of a set of galaxies to a precision of $|\Delta\langle z \rangle| < 0.002(1+z)$, we
argue that, for many different reasons, this is best accomplished by relying on the
photo-z themselves rather than on the direct measurement of $\langle z \rangle$ from spectroscopic
redshifts of a representative subset of the galaxies, as has usually been envisaged. We
present in an Appendix an analysis of the substantial difficulties in the latter approach 
that arise from the presence of large scale structure in spectroscopic survey fields. A 
simple adaptive scheme based on the statistical properties of the photo-z likelihood 
functions is shown to meet this stringent systematic requirement. We also examine 
the effect of an imprecise correction for Galactic extinction on the photo-z and the 
precision with which the Galactic extinction can be determined from the photometric 
data itself, for galaxies with or without spectroscopic redshifts. We also explore the 
effects of contamination by fainter over-lapping objects in photo-z determination. The 
overall conclusion of this work is that the acquisition of photometrically estimated 
redshifts with the precision required for Euclid, or other similar experiments, will be 
challenging but possible. 

Key words: galaxies: distances and redshifts - cosmology: observations - methods:
statistical.



1 INTRODUCTION 

Large scale mapping of the weak lensing shear field in three 
dimensions is emerging as a potentially very powerful cos- 
mological probe (Peacock et al., 2006; Albrecht et al., 2006). 
Weak lensing has the advantage of directly tracing the mass 
distribution, thereby bypassing much of the complex astrophysics
of the baryon component that underpins most of the
other probes and which may well dominate the systematic 
uncertainties in them. In contrast, the underlying physics of 
weak lensing is extremely simple, and the challenges are pri- 
marily on the observational side, particularly the accurate 
measurement of the weak lensing distortion and the estima- 
tion of distances to very large numbers of faint galaxies. 

* E-mail: rongmonb@phys.ethz.ch(RB) 



A weak lensing survey of half the sky (20,000 deg$^2$) to a
depth of $I_{AB} \sim 24.5$ and with a PSF of ~0.2 arcsec forms a
major part of the proposed ESA Euclid mission†. Euclid had
its origins in two proposals submitted for the first round of
the ESA Cosmic Visions 2015-2025 competition, the DUNE
imaging survey (Refregier et al., 2006) and the SPACE spectroscopic
survey (Cimatti et al., 2009). The combination of
the two surveys, plus the anticipated improved information
on the Cosmic Microwave Background from Planck‡, offers
dramatic improvements in our knowledge of the entire dark
sector, including the definition of the dark matter power
spectrum, the dark energy equation of state parameter w,
as well as much else.

† http://www.ias.u-psud.fr/imEuclid
  http://sci.esa.int/science-e/www/area/index.cfm?fareaid=102
‡ www.rssd.esa.int/Planck

Application of weak lensing for cosmology requires at 
least a statistical knowledge of the distances, i.e. redshifts, 
of large numbers of individual galaxies. At $I_{AB} < 24.5$, there
are about 2.5 billion galaxies in the Euclid $2\pi$ sr survey area
and so, realistically, reliance must be made on photometri- 
cally estimated redshifts (hereafter photo-z). 

1.1 Required redshift precision for precision 
cosmology with weak lensing 

Several papers have discussed the redshift precision that is 
needed for weak lensing analyses to enable the full potential 
of this approach to be exploited (Amara & Refregier, 2007; 
Ma et al., 2006; Abdalla et al., 2008). 

In the lensing tomography approach (Hu, 1999), indi- 
vidual galaxies are binned into a number of redshift bins. 
The shear signal is extracted from the cross-correlation of 
the shape measurements of individual galaxies in different 
redshift bins. These correlated alignments then give informa- 
tion (with some distance weighting function) on the mass 
distribution between the observer and the nearer of the 
two redshift bins. Redshift information for the galaxies is 
required at two conceptually distinct steps: first, the con- 
struction of the redshift bins used for the cross-correlation 
analysis to extract the weak lensing signal and, second, the 
estimation of the mean redshift of the galaxies in a given 
bin, which is required to map the results onto cosmological 
distance and thereby extract the cosmological parameters. 
It is, of course, possible to do a similar correlation analy- 
sis with unbinned data (Castro et al., 2005; Kitching et al., 
2008), but for our purposes this distinction is unimportant. 

The cross-correlation between different redshift bins is 
undertaken to exclude any galaxy pairs that may be physi- 
cally associated, i.e. have the same distance. This is to avoid 
the possibility that physical processes operating around in- 
dividual galaxies may produce an intrinsic alignment of the 
galaxies that may be mistaken for the coherent alignment 
produced by the weak lensing of the foreground mass dis- 
tribution. The required accuracy of the individual photo-z 
for the bin-construction task is set by the need to exclude 
overlaps in the N(z) of individual bins (or the probability dis- 
tribution for individual galaxies) and thereby remove phys- 
ically close pairs (King & Schneider, 2002). This typically 
sets a requirement on the precision of individual photo-z of 
about $\sigma_z = 0.05(1+z)$ (Bridle & King, 2007).

There is a second type of intrinsic alignment effect (Hirata
& Seljak, 2004), whereby the shape of the further of a
given galaxy pair may be affected, through lensing effects,
by the shape of the matter distribution around the nearer
galaxy, which is likely to be correlated with the visible shape
of that galaxy, thereby again producing a correlated alignment
of the two galaxies that unfortunately has nothing to
do with the lensing signal from the common foreground.
Joachimi & Schneider (2008, 2009)
have shown that it is possible to implement a nulling ap- 
proach to eliminate this second intrinsic alignment signal, 
which again requires a priori knowledge of individual red- 
shifts. 

Once the weak lensing signal is extracted, accurate
knowledge of the redshift of the galaxies, as with any cosmological
probe, gives, amongst other parameters, information
on the angular diameter distance $D_\theta(z)$. The sensitivity
of $D_\theta(z)$ to the relevant cosmological parameters ($\Omega_m$, $\Omega_\Lambda$,
$w$, etc.) gives the accuracy in the mean redshifts
that is required to achieve a given precision on the parameters.
As an example, Peacock et al. (2006) have shown
that a precision of 1% in $w$ requires a typical precision in
the mean redshift $\langle z \rangle$ of about 0.2%. The Euclid goal
is a 2% precision in $w$ (independent of priors), requiring a
precision of order 0.002(1+z) in $\langle z \rangle$. This simple approach
is confirmed by extensive analysis of the Fisher matrices
(Amara & Refregier, 2007; Ma et al., 2006). It is generally
the case that if the mean redshift of a bin is defined accurately
enough, then the higher moments of N(z) within the
bin will also have been sufficiently determined. Of course,
systematic biases in $\langle z \rangle$ that vary smoothly with redshift
are particularly troublesome as they will mimic the effect of
changing the cosmological parameters.

In summary, in order to reach the Euclid performance,
we require a statistical (random) r.m.s. precision of order
0.05(1+z) per galaxy (for the correlation analysis), and
a systematic precision in the mean z in each bin of order
0.002(1+z). These are both quite demanding requirements,
and together with the shape measurement itself
($\sigma_\gamma \sim 3 \times 10^{-4}$; Bridle et al., 2009; Amara & Refregier, 2008), they
represent one of the observational challenges that lie along
the path to enabling precision cosmology with weak lensing.

Fortunately, there are some mitigating features of weak 
lensing analysis. For instance, the analysis is robust (aside 
from root-n statistics) to the exclusion of individual galaxies, 
provided only that the exclusion is unrelated to their shapes. 
One is free therefore to reject galaxies that are likely to have
poor photo-z provided that they can be recognized a priori,
i.e. from the photometric data alone.

1.2 Challenges for the spectroscopic calibration of 
N(z) 

Given the stringent requirements on the systematic error in 
the mean redshift $\langle z \rangle$ of a particular bin, one approach (Abdalla
et al., 2008) is to define the N(z) and mean $\langle z \rangle$ through
the acquisition of spectroscopic redshifts for a representative 
subset of the galaxies. This direct sampling approach is cer- 
tainly the most conservative, but will be very challenging, 
in practice, for the following reasons. 

First, one clearly requires very large numbers of spectroscopic
redshifts. If we have a total redshift interval of $\Delta z$,
split into m bins, then the number of spectroscopic redshifts
N will be of order:

$$N \sim m^{-1} \left( \Delta z / \sigma_{\langle z \rangle} \right)^2 \tag{1}$$

This assumes that the photo-z are perfect, and that there are
no outliers. This leads trivially (Amara & Refregier, 2007)
to a requirement for $10^5$ to $10^6$ spectroscopic redshifts.

Secondly, these spectroscopic redshifts must be fully 
representative of the underlying sample. Any biases in the 
sampling of the bin or, even harder to reliably quantify, 
the almost inevitable biases in the ability to secure a re- 
liable spectroscopic redshift, must be dealt with via a pos- 
sibly complex and inevitably somewhat uncertain weighting 
scheme (Lima et al., 2008). It should be noted that current 
routine spectroscopic surveys of typical faint galaxies do not 



Photo-z Performance for Precision Cosmology 3 



even approach 100% completeness, even at brighter levels. 
One of the best to date is the zCOSMOS survey (Lilly et al., 
2007) of relatively bright $I_{AB} < 22.5$ galaxies, which yields,
at present, a 99% secure redshift for 95% of galaxies at its 
optimum 0.5 < z < 0.8 (Lilly et al., 2009). Most other sur- 
veys are significantly less complete. 

Even more invidious are the effects of large scale struc- 
ture in the spectroscopic survey fields, often called cos- 
mic variance. Our own semi-empirical analysis (see the Ap- 
pendix) of the COSMOS mock catalogues (Kitzbichler & 
White, 2007) shows that, in a given patch of sky, the N(z) 
at $I_{AB} \sim 24$ becomes dominated by cosmic variance as soon
as a rather small number of galaxies have been observed. The 
precise number depends on the field of view of the spectro- 
graph, but is typically about 20-100 for spectroscopic fields 
of order 0.02-1 square degrees, i.e. a sampling rate of only a 
few percent. This means that the spectroscopic survey must 
be split up over a very large number of independent fields 
and that to get the required $10^5$ to $10^6$ redshifts that are Poisson variance dominated
one must effectively cover the whole sky in a sparsely
sampled way. This is unlikely to be efficient with the large
telescopes needed for such faint object spectroscopy. A sim- 
ilar concern comes from the effects of Galactic extinction 
and reddening, which are likely, even when corrected for, 
to require spectroscopic sampling across the full range of 
Galactic latitude and longitude (b, l).

These difficulties prompt consideration of other ap- 
proaches, and in particular, that of placing greater reliance 
on the photo-z themselves, not only to construct the bins, 
but also to define their $\langle z \rangle$ with small systematic error.

1.3 Using photo-z for construction of N(z) 

The performance of photometric redshifts is continually im- 
proving. For example, in the COSMOS field (Scoville et al., 
2007), where we have very deep 30-band photometry from 
the ultraviolet (GALEX) to 5 μm, several photo-z schemes
now achieve a precision of $\sigma_z \sim 0.01(1+z)$, with an outlier
fraction (in non-masked areas, and defined as a redshift difference
greater than $0.15(1+z_{\mathrm{spec}})$) of < 1% (Ilbert et al.,
2009), at $I_{AB} < 22.5$ and 0.05 < z < 1.4, where the photo-z
can be checked with over 10,000 spectroscopic redshifts
from zCOSMOS. It should be noted in passing that
these 30 bands represent a very inhomogeneous data set in 
terms of point spread function, etc., and so this impres- 
sive photo-z performance also demonstrates the feasibility 
of combining disparate data into homogeneous photomet- 
ric catalogues. Of course, this outstanding performance in 
the COSMOS field is unlikely to be achieved over the whole 
sky for the foreseeable future because of the expense of the 
required multi-band photometry. Nevertheless, the demon- 
stration of this performance in COSMOS suggests that we 
have not yet reached any fundamental limit to photo-z per- 
formance. 

There are a number of different approaches to photo- 
z estimation that can be broadly distinguished between 
template-matching and more purely empirical approaches, 
such as Artificial Neural Networks (Collister & Lahav, 2004).
These have complementary strengths. Template fitting is
based on the observed limited dimensionality of galaxy spectral
energy distributions plus an astrophysical knowledge of
the effects that can modify them, e.g. the redshift itself,
and the effects of extinction in our own Galaxy or in the distant
galaxy. The more empirical approaches in essence avoid
any such assumptions, which is both a strength and a limita- 
tion. Although both approaches have passionate adherents, 
our own view is that both approaches can normally per- 
form equally well and that both are normally limited by the 
quality of the available data. In practice both use elements 
of the other, e.g. in template fitting, the actual data can 
be used to adjust the templates and the photometric zero- 
points, somewhat blurring the distinction. Finally, it should 
be noted that both can produce a likelihood distribution in 
redshift space through the application of priors (Bolzonella 
et al., 2000; Collister & Lahav, 2004; Benítez, 2000; Brammer
et al., 2008). In this paper we will base our analysis
on a template-fitting algorithm (ZEBRA; Feldmann et al.,
2006), since we believe its strengths are well suited to the
problem in hand. We also note that the impressive real-life
performance in COSMOS described above was achieved with
two independent template fitting codes (Le Phare; Ilbert
et al., 2009, and ZEBRA; Feldmann et al., 2006).

The improving photo-z performance described above
suggests that it may be possible to use the photo-z themselves
to construct the N(z), and thus $\langle z \rangle$ for each bin, provided
that the systematic uncertainties can be kept below
the required level of 0.002(1+z). Some spectroscopic calibration
would of course still be required, but the focus of this
would be on constructing and characterizing the photo-z al- 
gorithm, rather than on constructing the N(z) directly. 

This approach would have a number of potential advan- 
tages over that discussed by Abdalla et al. (2008) and oth- 
ers, and summarized above. At the very least, the number of 
spectra needed may be substantially reduced, although it is 
unlikely that one would wish to rely entirely on the photo-z
and eliminate the spectroscopic confirmation completely.
However, to characterize the uncertainties in the photo-z,
defined by $\sigma_z$, to the required level ($\sigma_{\langle z \rangle}$) we would need of
order

$$N \sim \left( \sigma_z / \sigma_{\langle z \rangle} \right)^2 \tag{2}$$

spectroscopic redshifts, which may be orders of magnitude
smaller than that implied by equation 1, since $\sigma_z \sim 0.05\,\Delta z$.
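
To make the comparison between equations 1 and 2 concrete, the following minimal sketch evaluates both scalings. It reads equation 1 as N ~ (Δz/σ_⟨z⟩)²/m, and all numerical inputs (the redshift interval, the number of bins, and the neglect of the (1+z) factors) are illustrative assumptions rather than values taken from the text.

```python
def n_spec_direct(delta_z, m_bins, sigma_mean):
    """Equation 1, read as N ~ (delta_z / sigma_mean)**2 / m_bins."""
    return (delta_z / sigma_mean) ** 2 / m_bins


def n_spec_photoz(sigma_z, sigma_mean):
    """Equation 2: N ~ (sigma_z / sigma_mean)**2."""
    return (sigma_z / sigma_mean) ** 2


# Illustrative inputs (assumptions, not survey specifications).
delta_z = 2.0        # total redshift interval
m_bins = 10          # number of redshift bins
sigma_mean = 0.002   # required precision on the mean redshift of a bin
sigma_z = 0.05       # per-object photo-z scatter

n_direct = n_spec_direct(delta_z, m_bins, sigma_mean)
n_photoz = n_spec_photoz(sigma_z, sigma_mean)
print(f"direct sampling of N(z):   ~{n_direct:.0f} spectra")
print(f"photo-z characterisation: ~{n_photoz:.0f} spectra")
```

With these assumed inputs the direct-sampling estimate is of order 10^5 spectra, while the photo-z characterisation needs only a few hundred, which is the sense in which the second requirement is far lighter.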

More importantly, the requirements on completeness 
and sampling are substantially relaxed since the photo-z 
characterization is done on individual objects. As one ex- 
ample, it is relatively easy to simulate the degradation in 
photo-z performance with noisier photometric data, so the 
calibrating spectroscopic objects need not necessarily extend 
all the way down to the photometric limit. The cosmic vari- 
ance problem in spectroscopic calibration is eliminated, and 
the uncertainties arising from Galactic reddening can also 
be substantially reduced. 



1.4 Subject of this paper 

The aim of this paper is to explore the performance of a
template fitting photo-z code as applied to simulated photometric
data of the approximate quality that we may realistically
expect for a $2\pi$ sr all-sky ground and space survey
within the next decade. Our emphasis is on both the per-object
performance and on the potential for recognising and
correcting systematic biases, which must be done to a high
level if the increased reliance on photo-z is to be possible.

As described in more detail in Section 3, we will as- 
sume for definiteness a photometric data set that includes 
the three-band near infrared photometry that is planned for 
Euclid itself, plus 5-band grizy photometry similar to that 
which should be produced by the PanStarrs-1, -2 and -4 
projects (hereafter PS-1, -2 and -4)§. Examining these three 
generations of ground-based survey probes a range of depths 
that can be compared against other future surveys such as 
DES¶ and LSST‖.

We then explore the following four topics that may
limit the photo-z performance and the usefulness of the photo-z
for constructing N(z) and $\langle z \rangle$:

(i) The photo-z performance on individual objects in 
terms of the r.m.s. scatter (and bias) between the true red- 
shift and the maximum likelihood photo-z, with particular 
emphasis on how to a priori identify and reject the outliers 
(catastrophic failures) from their individual photo-z L(z)
likelihood distributions. 

(ii) The construction of N(z) for a given set of photo-z se- 
lected galaxies, using their photo-z alone, with an emphasis 
on how to modify their individual likelihood functions L(z) 
to yield the least biased estimate of N(z) and $\langle z \rangle$ for the
ensemble. 

(iii) The systematic biases that can enter into the photo- 
z from an incorrect assumption about the level of fore- 
ground Galactic reddening, and how well the photometric 
data themselves can be used to determine the foreground 
reddening, both for a set of galaxies at known redshifts, and 
for those without known redshifts. 

(iv) The effects of the photometric superposition of two 
galaxies at different redshifts, leading to a mixed spectral 
energy distribution that may perturb the photo-z, with an 
emphasis on seeing what happens to the redshift likelihood 
distribution. An interesting question is whether such com- 
posite objects can be recognised photometrically as well as 
"morphologically" from the images. 

Our approach is to try to isolate these problems and 
to explore each in turn with the aim of providing an exis- 
tence proof that provides a plausible route to achieve the 
very high photo-z performance that is required for Euclid. 
In particular, we decided to construct the input photometric 
catalogues using exactly the same set of approximately ten 
thousand templates as we subsequently used in the ZEBRA 
photo-z code. This may strike some readers as being some- 
what circular. However, this approach allows us to eliminate 
the choice of templates as a variable, or uncertainty, in our 
analysis. This is motivated by the exceptional performance 
(discussed above) that has already been achieved with the 
same templates coupled with the exquisite observational data
in COSMOS. This strongly suggests to us that the choice of 
templates is unlikely to be the limiting factor with the de- 
graded photometry that we can realistically expect to have 
over the whole sky within the timescale of a decade or so. 



§ http://pan-starrs.ifa.hawaii.edu
¶ http://www.darkenergysurvey.org
‖ http://www.lsst.org



Although focused on the Euclid cosmology mission, the 
ideas and results from this paper may be of interest in many 
other applications that involve photo-z. 



2 GENERATION OF THE PHOTOMETRIC 
CATALOGUES 

In order to simulate the catalogues for this work, we use 
the COSMOS mock catalogues produced by Kitzbichler & 
White (2007). The mocks are generated from semi-analytic 
galaxy formation models using galaxy merger trees derived 
from the Millennium N-body simulation. The corresponding 
cosmological parameters are $\Omega_m = 0.25$, $\Omega_\Lambda = 0.75$, $\Omega_b = 0.045$,
$h = 0.73$, $n = 1$, and $w_\Lambda = -1$. The mocks have
an area of 1.4° x 1.4° and are magnitude limited at $I_{AB} < 26$.
For each galaxy the catalogues give the right ascension
RA, declination DEC, the redshift z and the magnitudes for 
the filters $B_j$, $g^+$, $r^+$, $i^+$, and $K_s$. The right ascension and
declination are in the range [-0.7°, 0.7°]. There are a total
of 24 mocks, each produced from a different wrapped cone 
that passes through the cubical simulation so that no object 
appears twice in a given cone. A given galaxy will however 
appear at different redshifts in the different mocks. Each 
mock catalogue contains approximately 600,000 objects, the 
majority of which are at z < 1. 

In order to produce a photometric catalogue with our 
own set of filters, we first identified an SED template, from 
the 10,000 available templates, that well matched the given 
$B_j$, $g^+$, $r^+$, $i^+$, and $K_s$ photometry for each galaxy in the
Kitzbichler & White (2007) catalogue, at its known redshift.
This operation made use of a program kindly provided by
Thomas Bschorr. These templates include a range of internal
reddening, spanning $0 < A_V < 2$ magnitudes. This
chosen template was then used to compute ideal photometry 
(i.e. without any observational noise) for this galaxy in any 
other passband of interest. Intergalactic absorption in each 
template is compensated for using the Madau law (Madau, 
1995). In order to match to the proposed Euclid weak lens- 
ing experiment, we consider only objects with Iab < 24.5. 
Except for the cosmic variance analysis in the Appendix, 
we combine all 24 mocks and use a random subsample to
mimic a survey over a large area in the sky. Each of our 
simulations contains at least 100,000 objects. 

To generate the photometric catalogues as observed, this
ideal photometric catalogue is degraded by adding Gaussian
noise (in flux) to the photometry according to the three different
sets of survey sensitivities that are listed in Table 1.
All three sets contain the nominal near-infrared photometry 
expected from Euclid but have different choices for the depth 
in grizy, and therefore explore the requirements for the 
ground-based complement of the infrared survey. Survey- 
A, which we generally find to be inadequate, uses the point 
source sensitivities from Cai et al. (2009). This is therefore a 
possibly optimistic representation of a PanStarrs-1-like survey.
Survey-C, which we find exceeds the requirements, uses
the PanStarrs-4 extended source sensitivities, which are calculated
by taking nominal PanStarrs-4 point source sensitivities
from Abdalla et al. (2008), degraded by 0.6 mag
to account for extended sources. The proposed LSST large 
area survey would be expected to be slightly deeper than 
this. Finally, Survey-B is intermediate between these two, 

Figure 1. The left panel shows the N(z) distribution of the artificial
catalogue with $I_{AB} < 24.5$ used in this paper. The right
panel shows the distribution of internal $A_V$ in the chosen SED
templates for these galaxies.



Assumed sensitivities for the representative surveys

Band         Survey-A   Survey-B   Survey-C
g            24.66      25.53      26.10
r            24.11      24.96      25.80
i            24.00      24.80      25.60
z            22.98      23.54      24.10
y            21.52      22.01      22.50
Euclid NIR
Y            24.00      24.00      24.00
J            24.00      24.00      24.00
H/K          24.00      24.00      24.00

Table 1. The filter sensitivities for different survey configurations
considered. The values quoted here are 5σ errors in AB magnitude.



and approximates what could be expected from PanStarrs- 
2 or the DES. The precise choice of these sensitivities is of 
course somewhat arbitrary, and they are adopted here for 
the sake of illustration. Unless stated otherwise, all magni- 
tudes stated in this paper are in the AB system. 
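
The degradation step described above (Gaussian noise added in flux, scaled by the survey sensitivity) can be sketched as follows. This is a minimal illustration, not the procedure actually used for the catalogues: it assumes that a 5σ limiting magnitude m5 corresponds to a 1σ flux error of f(m5)/5, uses an arbitrary flux zero point, and the helper names (`mag_to_flux`, `flux_sigma`, `degrade`) are hypothetical. The Survey-B limits are copied from Table 1.

```python
import random

# 5-sigma AB limits for Survey-B, copied from Table 1.
SURVEY_B = {"g": 25.53, "r": 24.96, "i": 24.80, "z": 23.54, "y": 22.01,
            "Y": 24.00, "J": 24.00, "H/K": 24.00}


def mag_to_flux(mag):
    """AB magnitude to flux in arbitrary units (zero point at mag = 0)."""
    return 10.0 ** (-0.4 * mag)


def flux_sigma(m5):
    """1-sigma flux error implied by a 5-sigma limiting magnitude m5 (assumed)."""
    return mag_to_flux(m5) / 5.0


def degrade(ideal_mags, limits, rng):
    """Return {band: (noisy_flux, flux_error)} for one galaxy."""
    observed = {}
    for band, mag in ideal_mags.items():
        sigma = flux_sigma(limits[band])
        observed[band] = (mag_to_flux(mag) + rng.gauss(0.0, sigma), sigma)
    return observed


rng = random.Random(42)
ideal = {band: 24.0 for band in SURVEY_B}   # hypothetical source, 24 mag in every band
noisy = degrade(ideal, SURVEY_B, rng)
for band, (flux, err) in noisy.items():
    print(f"{band:>3}  S/N = {flux / err:6.2f}")
```

Working in flux rather than magnitude keeps the noise Gaussian for faint sources, including those whose measured flux scatters below zero.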



2.1 Estimating photo-zs using ZEBRA

In this work the template-fitting photo-z code ZEBRA 
(Feldmann et al., 2006) is used to produce photo-z for the 
galaxies in each of the observationally degraded catalogues. 
ZEBRA gives a single best fit redshift, which we call the 
"maximum likelihood redshift" and template type, together 
with their confidence limits estimated from constant $\chi^2$
boundaries. ZEBRA also outputs the normalized likelihood 
functions L(z) for individual galaxies in various formats, 
which we also use in this paper. L(z) can be modified by 
a Bayesian prior, as desired, but is in any case normalised
so that the integral over all redshifts is unity. Further information
is available in the ZEBRA user manual**.



3 PERFORMANCE ON INDIVIDUAL 
OBJECTS 

In this section, we compare the basic performance of the 
photo-z estimation by comparing the maximum likelihood 
photo-z with the known redshifts of the galaxies, for the different
choices of survey depth in the simulations
presented in Section 2. The aim of this section is to assess 
what sort of ground-based data is required to complement 
the Euclid infrared photometry and to develop techniques 
for the automated recognition and elimination of outliers 
with wildly discrepant photo-z. 

3.1 Depth of ground-based photometry 

Using the photometric catalogues that were described in the 
previous section, and as degraded to simulate different sur- 
vey configurations, we compare their photo-z performance. 
We first bin the galaxies into narrow redshift bins on the 
basis of their "observed" photo-z. We then use the bias
$\Delta_z(z)$ and the dispersion $\sigma_z(z)$ to parameterize the performance,
defining these as follows. The error per object ($\delta z_i$) is

$$\delta z_i = z_{\mathrm{real},i} - z_{\mathrm{phot},i} \tag{3}$$

where $z_{\mathrm{real},i}$ and $z_{\mathrm{phot},i}$ are the real and photometric
redshifts of the ith galaxy. The mean bias in each photo-z
bin, $\Delta_z(z)$, is then

$$\Delta_z(z) = \langle \delta z \rangle \tag{4}$$

The r.m.s. deviation in the photo-z estimation within
the bin, $\sigma_z(z)$, is given by

$$\sigma_z^2(z) = \langle (\delta z - \Delta_z)^2 \rangle \tag{5}$$

and the total mean squared error (MSE) is given as

$$\mathrm{MSE}(z) = \sigma_z^2(z) + \Delta_z^2(z) \tag{6}$$

In Figure-2 we show the $\sigma_z(z)$ and $\Delta_z(z)$ for the different
survey configurations. The blue curves in all the panels
give the initial performance of the photo-z code, without any
attempt at removing outliers. As expected, increased depth
in the optical ground photometry increases the reliability of
the photo-z estimates. However, none of the configurations
match, without cleaning, the requirement of $\sigma_z(z)/(1+z) < 0.05$,
especially at the lower redshifts z ~ 0.5 where many
of the galaxies in fact lie. The green curves show the effect
of removing outliers, recognized purely photometrically (see 
Section 3.2), and the red curve shows the effect of modifying 
the individual L(z) as described in Section 3.3. 
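
The per-bin statistics of equations 3 to 6 can be computed along the following lines. This is a minimal sketch: the bin width, the function name `binned_performance`, and the sign convention (real minus photometric redshift) are assumptions made for illustration.

```python
import math


def binned_performance(z_real, z_phot, bin_width=0.1):
    """Bias, scatter and MSE of photo-z in bins of photometric redshift."""
    bins = {}
    for zr, zp in zip(z_real, z_phot):
        key = math.floor(zp / bin_width)              # bin on the "observed" photo-z
        bins.setdefault(key, []).append(zr - zp)      # equation 3
    rows = []
    for key in sorted(bins):
        dz = bins[key]
        n = len(dz)
        bias = sum(dz) / n                            # equation 4
        var = sum((d - bias) ** 2 for d in dz) / n    # equation 5
        rows.append({
            "z_mid": (key + 0.5) * bin_width,
            "n": n,
            "bias": bias,
            "sigma": math.sqrt(var),
            "mse": var + bias ** 2,                   # equation 6
        })
    return rows


# Toy input: three galaxies that fall in the same photo-z bin.
rows = binned_performance([0.50, 0.56, 0.60], [0.52, 0.55, 0.58])
for row in rows:
    print(row)
```

Because the MSE is the sum of the variance and the squared bias, it equals the mean of the squared per-object errors in the bin, which gives a quick consistency check on any implementation.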

3.2 A priori identification of outliers 

In template fitting, a likelihood function L(z) is derived for 
each galaxy, from which the maximum likelihood photo-z is estimated.
In an ideal case, with a well-defined photo-z estimate,
the L(z) has a single tight peak. Empirically, it is found that 

** http://www.exp-astro.phys.ethz.ch/ZEBRA/ 

Figure 2. The overall performance of survey-A, survey-B and survey-C, whose depths are as quoted in Table-1. The blue lines give
the performance without cleaning, the green lines after cleaning, and the red lines after cleaning and applying the
correction. It is seen that with cleaning and correction $\sigma_z(z)/(1+z) < 0.05$ is almost reached in all the cases and the systematic bias is also
reduced considerably. Survey-B after cleaning and correction reaches $\sigma_z(z)/(1+z) < 0.05$ easily. Rejections after cleaning amounted to 23% for survey-A,
13% for survey-B and 9% for survey-C.



many galaxies with poor photo-z estimates have a bimodal 
likelihood distribution. We therefore developed an algorithm 
that searches for bimodality in the likelihood curves of each 
galaxy. If a likelihood function contains more than one peak 
separated by a certain pre-defined redshift difference and if 
the ratio between primary and secondary peaks is above a 
threshold value, then the galaxy is flagged as a likely out- 
lier and can be rejected from the lensing analysis. This pre- 
defined threshold value can be tuned from simulations of the 
kind described here, or from spectroscopic measurements of 
actual redshifts. Of course, this procedure will undoubtedly 
remove some objects whose photo-zs are actually quite good, 
but the lensing analysis is stable to this kind of exclusion. 
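
A bimodality test of the kind described above might be sketched as follows. This is not the algorithm used in the paper: the peak finder, the parameter names `min_separation` and `min_ratio`, their default values, and the choice to compare the secondary peak height to the primary are all assumptions made for illustration.

```python
import math


def find_peaks(z_grid, like):
    """Local maxima of a sampled L(z) as (z, height), highest first."""
    peaks = []
    for i in range(1, len(like) - 1):
        if like[i] > like[i - 1] and like[i] >= like[i + 1]:
            peaks.append((z_grid[i], like[i]))
    return sorted(peaks, key=lambda p: p[1], reverse=True)


def is_likely_outlier(z_grid, like, min_separation=0.5, min_ratio=0.1):
    """Flag L(z) with a significant secondary peak far from the primary one."""
    peaks = find_peaks(z_grid, like)
    if len(peaks) < 2:
        return False
    z0, h0 = peaks[0]
    for z1, h1 in peaks[1:]:
        if abs(z1 - z0) >= min_separation and h1 / h0 >= min_ratio:
            return True
    return False


def gauss(z, mu, s):
    return math.exp(-0.5 * ((z - mu) / s) ** 2)


# Toy likelihoods on a regular grid: one unimodal, one with a distant second peak.
z_grid = [0.01 * i for i in range(401)]
single = [gauss(z, 0.5, 0.05) for z in z_grid]
double = [gauss(z, 0.5, 0.05) + 0.4 * gauss(z, 2.5, 0.05) for z in z_grid]
print(is_likely_outlier(z_grid, single), is_likely_outlier(z_grid, double))
```

Both thresholds would be tuned on simulations or on a spectroscopic sample, trading the outlier fraction against the number of usable galaxies that are lost.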

After removal of doubtful photo-z, the dispersion $\sigma_z(z)$
and mean bias $\Delta_z(z)$ are dramatically reduced, as shown
by the green lines in Figure-2. The major improvement in
$\sigma_z(z)$ and $\Delta_z(z)$ comes from rejection of catastrophic failures
rather than a tightening of the "good" photo-z. As the depth
of the photometry increases, it is found that fewer objects
need to be rejected to improve the photo-z estimates. In
the case of survey-A, we find that 23% must be rejected to get
below $\sigma_z(z) < 0.05(1+z)$; for survey-B it is 12% and for
survey-C, only 9%. The trade-off between beneficial cleaning
and the wasteful loss of objects determines the robustness of
the cleaning. After the above cleaning has been performed,
the fraction of 5σ outliers (catastrophic failures) is reduced
below 0.25% in all three surveys (see Table-2). It should
be noted that we have not taken into account priors such as
the size or luminosity of the galaxies, which might further
improve the performance.



3.3 Modification of the likelihood functions 

We find that the photo-z estimates can be further improved 
by modifying the L(z) on the basis of a relatively small 
number of spectroscopic redshifts, as follows: First we define 
a variable P(z) for each galaxy obtained by integrating the
L(z):

$$P(z) = \int_0^z L(z')\,dz' \tag{7}$$



For galaxies where we reliably know the real redshift z rea i 
(e.g. from spectra), we can compute P re ai = P(z rea i). If the 
likelihood functions of the galaxies have statistical validity, 
then the distribution of P rea i for all the galaxies for which 
one has spectra, i.e. N(P rea i), must be uniform between the 
extreme P values of and 1, i.e. 



\Preal*)dP rea l dP rea l 



(8) 



If this is found not to be the case, then one can argue that a modification or correction to the individual L(z) is warranted. We approach this by constructing a global mapping between P and P' that is determined from all objects with reliably known redshifts, such that the distribution of $P'_{\rm real}$ will be flat. We can write

$$\frac{dP'}{dz} = \frac{dP'}{dP} \times \frac{dP}{dz} \qquad (9)$$

Note that

$$\frac{dP'}{dP} = N(P_{\rm real}) \qquad (10)$$



and N(P') = 1 when P = P'. We then modify the L(z) for each galaxy, with or without a known redshift, to produce L'(z) such that, for all z,

$$P'(z) = \int_0^z L'(z')\,dz' \qquad (11)$$



It is easy to see that this is given by the following simple correction to L(z):

$$L'(z) = L(z) \times N(P(z)) \qquad (12)$$

where N(P(z)) is the "observed" distribution $N(P_{\rm real})$ with the mapping of z to P applied for each object using equation 11 above.

This procedure is clearly related to the application of a conventional Bayesian prior in redshift space, but is now applied in P (probability) space and is based on the absolute requirement of having a flat $N(P_{\rm real})$ for meaningful likelihood functions. The application of the global mapping of P to P' to all objects, independent of their nature, is of course arbitrary and cannot be rigorously justified. However, we find that this approach works well, both here and later in the paper.

This procedure is illustrated in Figure-3, in which the red line gives the $N(P_{\rm real})$ for all the galaxies with Survey-C like sensitivities. We therefore compute an empirical correction so as to make the $N(P_{\rm real})$ curve flat. The green line in Figure-3 is produced by using a sub-sample of 1000 galaxies. Note that the discrete binning in P space introduces noise in the N(P(z)) function, and hence the corrected green line is noisy. This noise does not translate into noise in z space.

From the peaks of the new likelihood function L'(z), we 
can also define a more accurate individual photo-z estimate 
as given by the red line in Figure-2. We stress that it is not 
necessary to know the N(P) very accurately and an aver- 
age correction of N(P) using a relatively modest number of 
spectra yields significant improvement to photo-z accuracy. 
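
The correction of equations 7 to 12 can be sketched in a few lines. This is only an illustration under stated assumptions: the likelihoods are toy Gaussians that claim a scatter of 0.05 while the true scatter is 0.10, the same 1000 objects are used both to measure N(P) and to test the result, and all function names (`cumulative`, `n_of_p`, `corrected`, `flatness`) are hypothetical and are not part of ZEBRA.

```python
import math
import random


def cumulative(z_grid, like):
    """P(z): trapezoidal integral of L up to each grid point, scaled so P(z_max) = 1."""
    cum = [0.0]
    for i in range(1, len(z_grid)):
        dz = z_grid[i] - z_grid[i - 1]
        cum.append(cum[-1] + 0.5 * (like[i] + like[i - 1]) * dz)
    return [c / cum[-1] for c in cum]


def interp(x, xs, ys):
    """Linear interpolation of ys(xs) at x; xs ascending, x clamped to the grid."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    lo, hi = 0, len(xs) - 1
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if xs[mid] <= x:
            lo = mid
        else:
            hi = mid
    t = (x - xs[lo]) / (xs[hi] - xs[lo])
    return ys[lo] + t * (ys[hi] - ys[lo])


def n_of_p(p_values, n_bins=10):
    """Binned N(P), normalised so a flat distribution gives 1 in every bin."""
    counts = [0] * n_bins
    for p in p_values:
        counts[min(int(p * n_bins), n_bins - 1)] += 1
    return [c * n_bins / len(p_values) for c in counts]


def corrected(like, p_of_z, n_p):
    """L'(z) = L(z) * N(P(z)), taking N from the bin that contains P(z)."""
    n_bins = len(n_p)
    return [value * n_p[min(int(p * n_bins), n_bins - 1)]
            for value, p in zip(like, p_of_z)]


def flatness(n_p):
    """Sum of squared deviations of N(P) from the flat value 1."""
    return sum((n - 1.0) ** 2 for n in n_p)


# Toy sample (assumption): over-confident Gaussian likelihoods.
random.seed(1)
z_grid = [0.005 * i for i in range(401)]
sample = []
for _ in range(1000):
    z_peak = random.uniform(0.6, 1.4)
    z_real = z_peak + random.gauss(0.0, 0.10)
    like = [math.exp(-0.5 * ((z - z_peak) / 0.05) ** 2) for z in z_grid]
    sample.append((z_real, like, cumulative(z_grid, like)))

# N(P_real) before the correction.
p_real = [interp(z_real, z_grid, p) for z_real, _, p in sample]
n_before = n_of_p(p_real)

# Apply L' = L * N(P) to every object and recompute P'_real.
p_real_new = []
for z_real, like, p in sample:
    like_new = corrected(like, p, n_before)
    p_real_new.append(interp(z_real, z_grid, cumulative(z_grid, like_new)))
n_after = n_of_p(p_real_new)

print("flatness before:", round(flatness(n_before), 3))
print("flatness after: ", round(flatness(n_after), 3))
```

With a coarse ten-bin N(P) a single pass does not make the distribution perfectly flat, because structure inside each P bin is not resolved, but the departure from flatness is reduced.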




Figure 3. Normalized N(P) before (red) and after (green) applying the correction scheme. In the ideal case this distribution should be flat in P.



Percentage of $5\sigma$ outliers ($f_{\rm cat}$)

Survey      Before Cleaning   After Cleaning
Survey-A    1.18              0.2300
Survey-B    0.8820            0.2138
Survey-C    0.8221            0.1776

Table 2. The percentage of $5\sigma$ outliers in the various surveys studied. $f_{\rm cat}$ reduces significantly once cleaning of the catalogue is performed, which identifies most of the outliers effectively.




Figure 4. N(z) constructed from the $\sum L(z)$ function before and after cleaning. Here the normalized histogram gives the real redshift distribution in the bin and the line is the N(z) constructed from the $\sum L(z)$ function. The left panel shows the redshift bin before cleaning and the right panel after cleaning. The constructed N(z) clearly traces the catastrophic failures.

$\langle \sigma_z(z)/(1+z) \rangle$ for different surveys in the range 0.3 < z < 3.0

Survey      Before Cleaning   After Cleaning   After Cleaning and Correction
Survey-A    0.1703            0.0884           0.0675
Survey-B    0.1164            0.0640           0.0497
Survey-C    0.0876            0.0492           0.0398

Table 3. The $\langle \sigma_z(z)/(1+z) \rangle$ for the three surveys studied. After cleaning and correction have been performed, survey-B just about reaches the Euclid requirement of $\sigma_z(z)/(1 + z) \sim 0.05$.

Figure 5. The bias in the mean of the tomographic bins estimated from the normalized L(z) functions for survey-A, survey-B and survey-C. For survey-C, cleaning for catastrophic failures followed by the correction gives $|\Delta\langle z \rangle/(1 + z)| < 0.002$. The shaded region is $|\Delta\langle z \rangle| = 0.002(1 + z)$. We have introduced a small offset in the x-axis values of survey-B and survey-C for legibility.



4 CHARACTERIZATION OF N(Z) FROM THE 
LIKELIHOOD FUNCTIONS 

In weak lensing tomography the photo-z are used to construct redshift bins which are then used to calculate the lensing power spectrum. The actual N(z) of each bin must then be known for quantitative interpretation of the lensing signal. The mean of the distribution is the most important parameter (Amara & Refregier, 2007) and we therefore focus on this. Generally a single redshift estimator from the photo-z code (i.e. the maximum likelihood photo-z) is used to construct these bins. However, if these single redshifts are used, the $\Delta\langle z \rangle$ requirement cannot be reached, as clearly shown in Figure-2. This is because the maximum likelihood redshifts cannot by construction trace the wings of the N(z) that lie outside of the nominal bins, or trace the remaining catastrophic failures associated with some of the photo-z. Therefore a more sophisticated approach is required.

As noted in the Introduction, one approach is to undertake a major spectroscopic survey of large numbers of representative objects in the bin and define the actual N(z) empirically in this way. As discussed there, there are a number of practical difficulties in doing this.

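In this paper we explore a different approach, which is to characterize N(z) as the sum of the likelihood functions for each redshift bin. We define the mean redshift inferred from summing the likelihoods as:

$$\langle z \rangle = \frac{\int z \sum L(z)\,dz}{\int \sum L(z)\,dz} \qquad (13)$$

and the bias in estimating $\langle z_{\rm real} \rangle$ as

$$\Delta\langle z \rangle = \langle z_{\rm real} \rangle - \langle z \rangle \qquad (14)$$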
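
As a minimal sketch of equations 13 and 14, the mean of a bin can be computed from the stacked likelihoods as below. The assumptions are ours: each L(z) is normalised to unit area before summing so that every galaxy carries equal weight, the two Gaussian likelihoods and the "real" redshifts are invented toy inputs, and the function names are hypothetical.

```python
import math


def trapz(ys, xs):
    """Trapezoidal integral of ys over xs."""
    return sum(0.5 * (ys[i] + ys[i - 1]) * (xs[i] - xs[i - 1])
               for i in range(1, len(xs)))


def stacked_mean(z_grid, likelihoods):
    """<z> = int z sum(L) dz / int sum(L) dz, with each L normalised to unit area."""
    stack = [0.0] * len(z_grid)
    for like in likelihoods:
        area = trapz(like, z_grid)
        for i, value in enumerate(like):
            stack[i] += value / area
    weighted = [z * n for z, n in zip(z_grid, stack)]
    return trapz(weighted, z_grid) / trapz(stack, z_grid)


def mean_bias(z_grid, likelihoods, z_real):
    """Delta<z> = <z_real> - <z>, and whether |Delta<z>| < 0.002 (1 + <z>)."""
    z_mean = stacked_mean(z_grid, likelihoods)
    delta = sum(z_real) / len(z_real) - z_mean
    return delta, abs(delta) < 0.002 * (1.0 + z_mean)


def gaussian(z_grid, mu, sigma):
    """Toy likelihood curve (assumption): an unnormalised Gaussian."""
    return [math.exp(-0.5 * ((z - mu) / sigma) ** 2) for z in z_grid]


z_grid = [0.01 * i for i in range(301)]
likes = [gaussian(z_grid, 0.9, 0.05), gaussian(z_grid, 1.1, 0.08)]
delta, ok = mean_bias(z_grid, likes, z_real=[0.92, 1.10])
print("Delta<z> =", round(delta, 4), "meets requirement:", ok)
```

In this toy case the stacked mean is 1.0 while the real mean is 1.01, so the bias of 0.01 exceeds the 0.002(1 + z) tolerance of 0.004.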



We apply this approach using the same modification techniques described in Section 3.3. The straight sum of the original likelihood functions is able to characterize the redshift distribution well, as seen in Figure-4, which shows that for survey-C the summed L(z) follows (visually) both the catastrophic failures and the wings of the redshift bins well. If we apply the cleaning algorithm described above, the number of catastrophic failures is reduced and the wings are constrained more tightly. However, this approach alone is not in fact good enough to characterize the N(z) of the bins to the required precision of $|\Delta\langle z \rangle| < 0.002(1 + z)$.

To characterize the bins more accurately, the L(z) correction scheme described in Section 3.3 was developed. We compute N(P) for each redshift bin separately, using a spectroscopically observed subsample of 800-1000 galaxies per bin. After correction, a new likelihood function L'(z), and therefore sometimes a new maximum likelihood redshift, is obtained for each galaxy. These are used to rebin the galaxies, and the sum of the new L'(z) is used to construct N(z) for the bins. In Figure-5 the bias on the mean of the N(z) is given for different redshift bins and survey parameters. The error bars on each point show the effect of repeatedly picking different random subsets for the spectroscopic calibration. The shaded region gives the Euclid requirement of $|\Delta\langle z \rangle/(1 + z)| < 0.002$ on the mean redshift of the redshift bins. The black dots are for survey-C, which easily reaches the Euclid requirement. The red open boxes are for survey-B, which just meets the Euclid requirement. The blue stars are for survey-A, which does not meet the specification given by the shaded region. From this analysis we conclude that for a Euclid-like survey with a survey-B like ground-based complement we can characterize the N(z) of the tomographic bins to a precision of $|\Delta\langle z \rangle/(1 + z)| < 0.002$, and that a random spectroscopic sub-sample of around 800-1000 galaxies per redshift bin is needed to do so.

The great advantage of this approach is that it sidesteps completely the problems associated with the presence of large scale structure in the spectroscopic survey fields, since the spectro-z are used to characterize, and globally modify, the photo-z estimates of individual galaxies, and not to characterize the N(z) directly, which will clearly be affected by such structure. It is also less susceptible to incompleteness in the measurement of spectroscopic redshifts, either in the selection for spectroscopic observation or in the success in measuring a redshift. That said, it is based on the assumption that, for a given spectroscopic target at some maximum likelihood photo-z, the ability to measure a redshift will not systematically depend on the location of the real redshift in P-space.

Figure 6. The bias introduced in the photo-z estimation by a small offset in the photometry after an average correction for the effects of reddening. For the survey-A and survey-C simulations two opposite $\Delta A_V$ offsets were investigated, 0.115 and -0.115. Changing the sign of $\Delta A_V$ leads to a more or less mirror inversion in the bias.



5 INTERNAL CALIBRATION OF GALACTIC 
FOREGROUND EXTINCTION 

In this section we explore the effect that errors or uncertainties in foreground Galactic extinction can have on photo-z, and examine whether the photometry of large numbers of galaxies, with or without known spectroscopic redshifts, can be used to determine an improved extinction map and locally correct the extinction. This latter aspect is an extension of the iterative adjustment of photometric zero-points that is now standard in many template-fitting photo-z codes.

Extragalactic photometry is routinely corrected for the effects of foreground Galactic extinction using reddening maps and an assumed extinction curve. In practical terms, the effect that we should therefore worry about is an error or uncertainty in the $A_V$, i.e. a $\Delta A_V$, which may be positive or negative. This will cause galaxies to be either too red, or too blue, in the photometric input catalogue.

In this section we construct a catalogue containing $10^4$ objects down to $I_{AB} \sim 24.5$. This mimics a roughly 0.1 deg$^2$ region of the Euclid survey. We consider photometry with the accuracy expected from both survey-A and survey-C, and then perturb these catalogues by applying a standard mean reddening law (Cardelli et al., 1989) with a relatively large $\Delta A_V \sim 0.155$, in both the positive and negative directions. We then compare the photo-z for the galaxies with and without these $\Delta A_V$ offsets. Figure-6 shows the bias between these photo-z estimates as a function of redshift. The bias fluctuates in redshift in a somewhat random way, with large systematic excursions at low redshifts. Interestingly, but perhaps not surprisingly, changing the sign of the $\Delta A_V$ leads to a largely mirrored effect on the redshifts, suggesting that the response of the photo-z scheme to errors in $A_V$ is linear.

Figure 7. The left panel gives estimates of $\Delta A_V$ for different magnitude bins (in the configuration of Survey-C). The red line is obtained using internally computed photo-z, without knowledge of the redshifts of the galaxies, and the blue line using spectroscopic information. We have introduced a small offset in the x-axis on the red line for legibility. The right panel is the number of objects per magnitude bin. With spectra, around 350 spectra in the $I_{AB} \sim 22$ magnitude bin are sufficient to estimate $\Delta A_V$ accurately.

Figure 8. Performance of the photo-z estimates before and after the estimated $\Delta A_V$ correction has been applied. Using this correction scheme, the systematic bias introduced by $\Delta A_V$ is reduced significantly, to below the 0.002(1 + z) level.

Although we chose quite a large "worst-case" error in $A_V$, the biases with redshift seen in Figure-6 are almost ten times worse than the 0.002(1 + z) photo-z bias that can be tolerated by Euclid's precision cosmology. Large-scale redshift-dependent biases in the photo-z are particularly worrisome as they mimic the effect of cosmological parameters. We therefore explore the possibility of iteratively identifying the residual $\Delta A_V$ error as follows. We assume that we will know the wavelength dependence of the reddening in a given field and that the problem is therefore in determining the $A_V$. At high Galactic latitude, we expect that the wavelength dependence could be determined from very large areas of sky, but that $A_V(b, l)$ may vary on small scales.

To estimate the ability of the photometry to determine $\Delta A_V$, we take the input photometric catalogue and "correct" it for a wide range of assumed $\Delta A_V$ around zero. We then run ZEBRA on each of these corrected catalogues and take the $\sum \chi^2_{\min}$ of all the individual galaxies (i.e. the sum of the "best fit" chi-squareds). The value of $\Delta A_V$ that produces the minimum $\sum \chi^2_{\min}$ is taken as the best estimate of $\Delta A_V$ in that region. The exercise can be undertaken with the redshifts of the galaxies as a free parameter, or by assuming that the galaxies have known redshifts and looking at the $\sum \chi^2_{\min}$ amongst the templates at the known redshift of each galaxy. The sample used here is magnitude limited to $I_{AB} < 24.5$.

To obtain an error bar on $\Delta A_V$ we compute a reduced chi-squared ($\chi^2_r$). For this we need to know the total degrees of freedom available in the template fitting approach, which is non-trivial since it is unclear how many degrees of freedom are associated with the 10,000 templates. We assess this by requiring that the $\chi^2_r$ be unity (using Survey-C, although this should not be important) and find that this gives a dimensionality close to 3, which sounds reasonable given what is known about galaxy spectra (see Connolly et al. (1995)). A rule of thumb for estimating the uncertainty in one parameter gives (Avni, 1976)

$$\chi^2_{1\sigma\ {\rm confidence\ level}} = \chi^2_{\min} + 1 \qquad (15)$$

Hence the $1\sigma$ uncertainty in the $\Delta A_V$ estimate is given by the values of $\Delta A_V$ for which the reduced chi-squared lies below $\chi^2_r + 1/{\rm DOF}$.
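
The scan over trial $\Delta A_V$ and the $\chi^2_{\min} + 1$ interval can be illustrated with a toy catalogue. This sketch makes several assumptions that are not taken from the paper: the template magnitudes of each galaxy are treated as known (loosely the "known redshift" case), each galaxy has a free magnitude offset, all bands share the same error, and the extinction ratios in `K_BAND` are illustrative placeholders rather than the Cardelli et al. values. It works with the total $\chi^2$, for which the criterion is $\chi^2 \le \chi^2_{\min} + 1$.

```python
import random

# Placeholder A_lambda / A_V ratios for five bands (NOT the Cardelli et al. curve).
K_BAND = [1.5, 1.2, 0.9, 0.6, 0.3]


def chi2_galaxy(mags, model, sigma, trial_dav):
    """Best-fit chi^2 of one galaxy after removing trial_dav, with a free offset."""
    resid = [m - trial_dav * k - t for m, k, t in zip(mags, K_BAND, model)]
    offset = sum(resid) / len(resid)  # free normalisation, equal errors assumed
    return sum(((r - offset) / sigma) ** 2 for r in resid)


def scan_dav(catalogue, sigma, trials):
    """Sum of best-fit chi^2 over the catalogue for every trial Delta A_V."""
    return [sum(chi2_galaxy(m, t, sigma, d) for m, t in catalogue) for d in trials]


def best_and_interval(trials, chi2):
    """Trial with minimum chi^2 and the range of trials with chi^2 <= min + 1."""
    chi2_min = min(chi2)
    best = trials[chi2.index(chi2_min)]
    inside = [d for d, c in zip(trials, chi2) if c <= chi2_min + 1.0]
    return best, min(inside), max(inside)


random.seed(3)
true_dav, sigma = 0.10, 0.05
catalogue = []
for _ in range(300):
    model = [random.uniform(20.0, 22.0) for _ in K_BAND]  # toy template magnitudes
    mags = [t + true_dav * k + random.gauss(0.0, sigma)
            for t, k in zip(model, K_BAND)]
    catalogue.append((mags, model))

trials = [round(-0.2 + 0.005 * i, 3) for i in range(81)]  # -0.2 ... +0.2
chi2 = scan_dav(catalogue, sigma, trials)
best, lo, hi = best_and_interval(trials, chi2)
print("best Delta A_V:", best, "1-sigma range on the grid:", lo, hi)
```

Because the free offset absorbs any grey shift, only the colour-dependent part of the extinction constrains $\Delta A_V$ in this toy setup.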

To estimate the effect of photometric noise in estimating $\Delta A_V$ in this internal way, we consider objects in magnitude bins in $I_{AB}$. The results are shown in Figure-7. We find that it is worth using only galaxies with relatively high S/N photometry, i.e. with the adopted survey parameters, down to $I_{AB} \sim 22$. Fainter than this, the estimate degrades appreciably. The addition of spectroscopic redshifts reduces the error bar significantly, but the method is still practicable down to the same magnitude limit and yields an error on $\Delta A_V$ of order 0.01 (with known redshifts) or 0.02 (without). To close the loop, we show in Figure-8 the bias in photo-z introduced by a residual error of 0.01 in $\Delta A_V$, which shows an improvement in the photo-z errors by a factor of 10.



6 IMPACT ON PHOTO-Z FROM BLENDED
OBJECTS AT DIFFERENT REDSHIFTS

Sometimes multiple galaxies will overlap on the sky, and the photometry will be a composite of the two spectral energy distributions. Even with spectroscopy, such objects have composite spectra, rendering their spectroscopic redshift estimation non-trivial. In this final section we explore the effects of this blending on photo-z estimates. We simulate many such blended objects by constructing composite spectral energy distributions from galaxies at different redshifts, different colours and a wide range of relative brightnesses, from dominance of one through to dominance of the other. For definiteness we look at the photo-z behaviour for a survey-C like survey.

We select several objects from the main COSMOS mock catalogues at different redshifts and having different colours and normalize their fluxes to have the same $I_{AB}$ brightness. We then adjust the brightnesses of the two objects over a range of ±6 magnitudes relative to each other, produce a co-added spectrum by averaging the fluxes, and then renormalise the resulting composite back to have $I_{AB} = 23.5$, i.e. one magnitude brighter than the survey limit. Gaussian noise is then added in the usual way to the composite SED to represent the Survey-C sensitivities.

ZEBRA is then run on this set of blended objects and the resulting likelihood curve of each composite object, along with a single maximum likelihood redshift, is output. In Figure-9, each box represents a single composite object consisting of a red galaxy (whose redshift is indicated on the left) and a blue galaxy (whose redshift is indicated along the top). In each panel, the likelihood curve is plotted as a function (vertically) of the adopted magnitude difference.

We see from Figure-9 that if a pair of objects consists of two galaxies at similar redshifts then there is a smooth transition from one redshift to the other. When the magnitude difference is 2 magnitudes or more, the redshift of the brighter galaxy is returned. When the contrast is lower, the maximum likelihood redshift transitions smoothly between the two, but will not accurately represent either component. When the pairs are at different redshifts, i.e. $\Delta z > 0.75$, the returned redshifts still trace the redshift of the brighter of the pair for $\Delta I_{AB} > 2$ magnitudes. However, for the region $\Delta I_{AB} < 2$, the behaviour is more varied. Bimodality in the L(z) is often seen, and sometimes a local maximum at a third intermediate redshift. There are often sharp transitions. For the classically degenerate redshift pairs, i.e. [0.25, 0.5, 2.5, 3.5], the photo-z generally make a sharp transition between the two redshifts.

Our general conclusion from this analysis is that the photo-z of blended objects are trustworthy only if the second component is at least two magnitudes fainter than the primary (in a waveband close to the middle of the spectral range of the photometry). At smaller magnitude differences the photo-z can be corrupted in a way that is not always recognizable, and we suggest therefore that these blended objects be recognized morphologically from the images, and excluded from the analysis. Fortunately, one would probably want to do this anyway for a lensing analysis because their shape measurements would be hard to use quantitatively.


7 CONCLUSIONS 

In this work we have investigated a number of issues that could potentially limit the photo-z performance of deep all-sky surveys, and thereby impede the ambitious precision cosmology goals of survey programs such as the proposed ESA Euclid mission. In each case, we find that, while standard techniques do not get to the required accuracy, simple new approaches can be developed that appear to get to the required performance.

Figure 9. L(z) functions of 42 pairs of blended objects. The objects were boosted to magnitude differences in the range [-6, 6] in the I band and the variation of the likelihood functions is observed. When the low redshift object is brighter than the high redshift one, the likelihood function traces the low redshift object. When both objects are equally bright we see bimodality in the likelihood functions, and as the magnitude difference increases the likelihood function jumps to trace the high redshift object. The degeneracy of the likelihood functions at $\Delta m \sim 0$ means that the redshift estimation becomes completely unreliable in that region. The objects are taken at survey-C like depth and are chosen such that in each pair one object is a red galaxy and the other a blue galaxy.

Knowledge of the redshifts enters into weak-lensing analyses at two distinct steps: first, the selection of objects for shape cross-correlation, and second, the accurate knowledge of the mean redshifts of these objects. For practical reasons, the first step will likely require the use of photo-z for the foreseeable future. A major motivation for the current work has been to develop techniques that rely on photo-z also for the second step, bypassing the need for very large and highly statistically complete spectroscopic surveys (e.g. Abdalla et al., 2008). This approach thereby avoids the substantial practical difficulties that will be encountered in such spectroscopic surveys from incompleteness and the effects of large scale structure. A separate Appendix explores the latter effects in some detail.

The work is based entirely on simulated photometric catalogues that have been constructed to match the expected performance of three generic ground-based surveys, combined with the expected near-IR imaging photometry from Euclid. To construct these catalogues, we use the same set of 10,000 templates as used for the template-fitting photo-z program (ZEBRA). This possibly circular approach allows us to remove the choice of templates as a variable. We believe it is justified at the current time by the very impressive performance of template-fitting photo-z codes applied to the deep multi-band COSMOS photometry, which strongly suggests that the choice of templates will not be a limiting factor at the required level, although further refinement will be desirable.

The analysis is conveniently summarized in terms of the two main requirements on photo-z for precision weak lensing, namely to obtain an r.m.s. precision per object of < 0.05(1 + z) and a systematic bias on the mean redshift of a given set of galaxies of < 0.002(1 + z). Our main conclusions may be summarized as follows:

• To achieve an r.m.s. photo-z accuracy of $\sigma_z(z)/(1 + z) \sim 0.05$ down to $I_{AB} < 24.5$, we need the combination of ground-based photometry with the characteristics of Survey-B (similar to PanStarrs-2 or DES) and the deep all-sky near-infrared survey from Euclid itself. This performance also requires the implementation of an "a priori" rejection scheme (i.e. based on the photometry alone, without knowledge of the actual redshifts of any galaxies) that rejects 13% of the galaxies and reduces the fraction of $5\sigma$ outliers to $f_{\rm cat} < 0.25\%$. There is a trade-off between the rejection of outliers and the loss of "innocent" galaxies with usable photo-z. Deeper photometry both improves the statistical accuracy of the photo-z and reduces the wastage in eliminating the catastrophic failures, and the combination of survey-C with Euclid near-infrared photometry achieves $\sigma_z(z)/(1 + z) \sim 0.04$ after 9% rejection.

• A good way to determine the actual N(z) of a set of galaxies in a given photo-z selected redshift bin is to sum their individual likelihood functions. We find that the $\sum L(z)$ function represents well both the wings of the N(z) and the remaining catastrophic failures (outliers). However, to reach the required performance on the mean of the redshifts, $|\Delta\langle z \rangle| < 0.002(1 + z)$, with the Survey-B, or deeper survey-C, combination (together with Euclid infrared photometry), we had to implement a modification scheme on the individual L(z). This is based on the spectroscopic measurement of redshifts for a rather small number of galaxies (fewer than 1000) with relaxed requirements on statistical completeness (and no dependence on large scale structure in the spectroscopic survey fields). We then require that the distribution of the actual redshifts within the probability space that is defined by the individual L(z) should be flat across the sample as a whole. This should be true of any set of galaxies, leading to relaxed requirements on sampling of the redshift survey. This approach is similar to the application of a Bayesian prior on the redshifts, but is performed in probability space. Although it cannot be rigorously justified, it is found to work well in practice.

• We find that uncertainties in foreground Galactic reddening can have a serious effect in perturbing the photo-z, with a net sign that varies erratically with redshift. However, we also find that such errors in $A_V$ can be identified internally from the photometric data of galaxies, either with or without spectroscopic redshifts. This procedure works best for galaxies with relatively high S/N photometry, $I_{AB} < 22$. The required number of galaxies suggests that a reddening map on scales of 0.1 deg can be internally constructed from the data on galaxies without known redshifts, or from a few hundred galaxies with spectroscopic redshifts.

• We also explored the effect of blended objects. The photo-z of the composite spectral energy distribution is a good representation of the redshift of the brighter object as long as the magnitude difference is large, i.e. $\Delta I_{AB} > 2$. When the galaxies are more similar in brightness, $\Delta I_{AB} < 2$, there is a wide range of behaviour. In some cases, multi-modal likelihood functions appear, while in others there is a sharp transition from one redshift to another, sometimes with a local maximum at a third, completely spurious redshift. In still others, the likelihood function smoothly transitions between the two redshifts with a single peak at an intermediate redshift. Our conclusion is that composite objects with $\Delta I_{AB} < 2$ should be recognized morphologically from imaging data and removed from the photo-z analysis.

The general conclusion of our study is that while reaching the photo-z performance required for weak-lensing surveys such as Euclid will not be trivial, the implementation of new techniques, coupled with internal calibration of e.g. foreground reddening from the photometric data itself, will allow the required performance to be attained. If more reliance is placed on the photo-z themselves, then this leads to a significant simplification of the otherwise challenging requirement for spectroscopic calibration of large scale photometric surveys.

APPENDIX A: SAMPLING REQUIREMENTS
FOR DIRECT SPECTROSCOPIC
CALIBRATION

In this Appendix, we analyse the number of independent fields that are required if one takes the approach of determining N(z) for a given redshift bin from spectroscopic observations of a representative set of galaxies from that bin. The requirement derives from desiring that the effects of large scale structure in the galaxy distribution, also known as cosmic variance, are at most equal to the Poisson noise in determining the error in $\langle z \rangle$.

For the sake of definiteness, we use the same set of 24 COSMOS mock catalogues (Kitzbichler & White, 2007) as used for the main paper. These are all derived from the same numerical simulation (the "Millennium Run") but the 24 light cones sample this in such a way that the large scale structure at any given redshift is independent from one mock to the next. These different light cones therefore give a measure of the variance that is introduced by large scale structure.

We compute the effects of this large scale structure on $\langle z \rangle$, the average redshift of a set of galaxies in some defined redshift bin, as follows. We first look at all the galaxies in this redshift bin, across the whole 2 deg$^2$ field and across all of the mocks. The variance of their individual redshifts is given by the following expression,

$$\sigma_z^2 = \langle z^2 \rangle - \langle z \rangle^2 \qquad (A1)$$

For a top-hat distribution of redshifts within a redshift bin of width $\Delta z$, this would be given by

$$\sigma_z^2 = \frac{(\Delta z)^2}{12} \qquad (A2)$$

We then imagine carrying out a redshift survey using a spectrograph with a particular field of view, which is assumed to be square. We presume that galaxies (within this redshift bin) are randomly selected spatially within this field of view. We then compute for each mock i the average redshift of the $n_{s,i}$ galaxies that fall within the spectrograph field of view. Each field (or mock) yields an estimate of the underlying $\langle z \rangle$, which we denote by $\zeta_i$. The variance of this quantity across the m mocks is then computed as

$$\sigma_\zeta^2 = \frac{1}{m}\sum_i (\zeta_i - \langle \zeta \rangle)^2 = \frac{1}{m}\sum_i \zeta_i^2 - \langle \zeta \rangle^2 \qquad (A3)$$

In the Poisson case, this variance would be equal to the variance coming from the $n_s$ galaxies in the field, which will be approximately given by $\sigma_z^2/n_s$, or more precisely by

$$\sigma_P^2 = \frac{1}{m}\sum_i \frac{\sigma_z^2}{n_{s,i}} = \sigma_z^2 \left\langle \frac{1}{n_s} \right\rangle \qquad (A4)$$

For spectrograph fields of view that are much smaller than the mock, we can repeat the exercise for different locations of the spectrograph field within the 2 deg$^2$ field of the mocks, averaging the calculations of $\sigma_\zeta^2$. We consider spectrograph fields of view given by 1.4/N deg, with N = 1, 6, yielding in each case $N^2$ different locations within the 2 deg$^2$ field.

The difference between the "observed" $\sigma_\zeta^2$ and that from the Poisson variance $\sigma_P^2$ is the contribution of the large scale structure, or cosmic variance, $\sigma_{cv}^2$.

$$\sigma_{cv}^2 = \sigma_\zeta^2 - \sigma_z^2 \left\langle \frac{1}{n_s} \right\rangle \qquad (A5)$$

Given these estimates, we can establish, for each redshift bin and for each spectrograph field of view, a critical number of spectroscopic redshift measurements at which the cosmic variance will be equal to the Poisson variance.

$$N_{\rm crit} = \frac{\sigma_z^2}{\sigma_{cv}^2} \qquad (A6)$$

As the number of spectroscopic redshifts reaches this level, the standard deviation in the average redshift estimate $\zeta$ is already $\sqrt{2}$ times higher than would be the case for Poisson noise alone, and obtaining more spectra in a given field will lead to little improvement in the estimate of the global mean redshift $\langle z \rangle$.
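
The variance bookkeeping of this Appendix can be written out as a short sketch. It assumes the reading $\sigma_P^2 = \sigma_z^2 \langle 1/n_s \rangle$, $\sigma_{cv}^2 = \sigma_\zeta^2 - \sigma_P^2$ and $N_{\rm crit} = \sigma_z^2/\sigma_{cv}^2$; the five field means and the counts are made-up inputs rather than values from the mocks, and the function names are hypothetical.

```python
def tophat_variance(width):
    """Variance of a top-hat redshift distribution of the given width."""
    return width ** 2 / 12.0


def variance_budget(field_means, field_counts, sigma_z2):
    """Split the field-to-field variance of the mean redshift into its parts.

    Returns (observed variance, Poisson variance, cosmic variance, N_crit).
    """
    m = len(field_means)
    mean = sum(field_means) / m
    var_obs = sum(z * z for z in field_means) / m - mean ** 2
    var_poisson = sigma_z2 * sum(1.0 / n for n in field_counts) / m
    var_cv = var_obs - var_poisson
    # If noise drives the estimate to zero or below, cosmic variance is undetected.
    n_crit = sigma_z2 / var_cv if var_cv > 0 else float("inf")
    return var_obs, var_poisson, var_cv, n_crit


# Toy inputs (assumption): five fields, 100 redshifts each, bin of width 0.2.
field_means = [0.98, 1.02, 1.00, 1.04, 0.96]
field_counts = [100] * 5
sigma_z2 = tophat_variance(0.2)

var_obs, var_poisson, var_cv, n_crit = variance_budget(
    field_means, field_counts, sigma_z2)
print("observed:", var_obs, "Poisson:", var_poisson)
print("cosmic variance:", var_cv, "N_crit:", round(n_crit, 2))
```

For these inputs the field-to-field scatter is far larger than the Poisson term, so $N_{\rm crit}$ comes out at only a few galaxies per field.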

Figure-A1 shows the derived $N_{\rm crit}$ for galaxies with $I_{AB} < 24.5$ in six representative redshift bins for spectrograph fields of view ranging from a few arcmin (e.g. NIRSpec on JWST) up to fields of 1.4 degree, which is approaching the largest that is likely to be practical on an 8-m class telescope. For reference, the current VIMOS spectrograph on the VLT has a field of about 15 arcmin square. These rather low values of $N_{\rm crit}$ reflect the familiar observational experience that the large scale structure in a given survey field starts to become apparent in the N(z) distribution after only a few galaxies have been observed.

In particular, it can be seen that the value of $N_{\rm crit}$ is usually very much smaller than the average number of galaxies within the field of view of the spectrograph, which is shown as the dotted line in each panel. The difference between these curves indicates the maximum permitted sampling rate. This emphasizes that the redshift distribution from a fully sampled spectroscopic survey is likely to be severely cosmic variance limited and that very much lower sampling rates are required to keep the effects of large scale structure comparable to the Poisson term. For example, a survey with the VIMOS spectrograph, with a field of view of about 15 arcminutes square, would require a sampling rate at low redshift of 2% (i.e. one galaxy in 50) or less. Only at the highest redshifts and with the smallest field sizes is cosmic variance unimportant. The low sampling rates demanded by this analysis would impose quite severe inefficiencies on the utilization of slit-mask spectrographs for such a program of spectroscopic calibration of photo-z bins. These would be mitigated for fibre-fed spectrographs, although the performance of these at $I_{AB} \sim 24.5$ has not yet been proven. Regardless, it is clear that the survey fields for such a programme would have to be distributed over a significant portion of the sky.

The point at which $N_{\rm crit}$ starts to increase as the square of the field of view (i.e. where the solid line becomes parallel to the dotted lines) shows the point at which it is safe to mosaic adjacent survey fields to build up survey area and galaxy number. This occurs at about degree-scales for z > 1.7.

If we then take the minimum of $N_{\rm crit}$ and the available number of galaxies in the field, which is almost always given by the former, we can then compute the minimum number of independent fields that will be required in order to attain an uncertainty in the mean redshift $\langle z \rangle$ of this particular redshift bin that is equivalent to the Poisson variance from $10^4$ galaxies. This is shown in Figure-A2.
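
One way to turn $N_{\rm crit}$ into a number of fields is sketched below. The prescription is our assumption and may differ from the one used for Figure-A2: fields are taken to be independent, and the variance of each field mean is taken to be the sum of a cosmic variance term $\sigma_z^2/N_{\rm crit}$ and a Poisson term $\sigma_z^2/n$, with n the smaller of $N_{\rm crit}$ and the number of galaxies available. The function name and the example numbers are hypothetical.

```python
import math


def fields_needed(n_crit, n_available, n_target=10_000):
    """Independent fields needed to match the Poisson variance of n_target galaxies.

    Assumes the per-field variance is sigma_z^2 * (1 / n_crit + 1 / n_used),
    with n_used = min(n_crit, n_available), and that fields are independent.
    """
    n_used = min(n_crit, n_available)
    value = n_target * (1.0 / n_crit + 1.0 / n_used)
    # Small tolerance so that rounding noise does not add a spurious field.
    return math.ceil(value - 1e-9)


# Illustrative inputs only, not values read from Figure-A1.
print(fields_needed(n_crit=50.0, n_available=5000))  # sampling limited by N_crit
print(fields_needed(n_crit=50.0, n_available=10))    # limited by available galaxies
```

Under this assumption, observing exactly $N_{\rm crit}$ galaxies per field requires $2 \times 10^4 / N_{\rm crit}$ fields, the factor of two reflecting the equal Poisson and cosmic variance contributions at that sampling.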

If one imagines doing a single survey to cover the entire redshift range, then the number of fields will generally be set by the lower redshifts, where the effects of large scale structure are most severe. Only at very small field sizes does the low number density of very high redshift galaxies become the limiting factor. It can be seen that about 400 widely spaced degree-scale survey fields, 700 VIMOS fields, or about 2000 NIRSpec (3-arcminute) survey fields would be required. This requirement clearly approximates an all-sky sparsely sampled survey.

The difficulty of implementing such a scheme was a major motivation for considering the alternative approach to constructing N(z) which is developed in the main paper. The effects of large scale structure are irrelevant in that approach, which considers photo-z calibration on an object-by-object level.

Figure A1. The critical number of redshifts at which the variance in the mean redshift becomes dominated by the effects of large scale structure, or cosmic variance. The solid lines give the derived $N_{\rm crit}$ for $I_{AB} < 24.5$ in six representative redshift bins for spectrograph fields of view from a few arc minutes up to 1.4 degree. The dotted line gives the average number of galaxies within the field of view of the spectrograph. The difference between these lines indicates the maximum permitted sampling rate. A fully sampled spectroscopic survey is likely to be severely cosmic variance limited and much lower sampling rates are required to keep the effects of large scale structure comparable to the Poisson term.

Figure A2. The minimum number of independent fields required in order to ensure that the uncertainty in the mean redshift of a given redshift bin is not dominated by large scale structure or cosmic variance, and is equal to the Poisson variance from $10^4$ galaxies. Clearly for an all-sky survey, the number of required fields will be set by the lower redshift range.



